Abstract

In this paper we study opportunistic transmission strategies for cognitive radios (CR) in which
causal noisy observations of the primary user (PU) state are available. The PU is assumed to operate
in a slotted manner, according to a two-state Markov model. The objective is to maximize the utilization
ratio (UR), i.e., the relative number of PU-idle slots that are used by the CR, subject to keeping the interference ratio
(IR), i.e., the relative number of PU-active slots that are used by the CR, below a certain level. We introduce
an a-posteriori log-likelihood ratio (LLR) based cognitive transmission strategy and show that this strategy is optimum in the
sense of maximizing UR given a certain maximum allowed IR. Two methods for calculating the threshold
for this strategy in practical situations are presented. One of them performs well at high SNRs but
might yield too large an IR at low SNRs and low PU activity levels; the other is proven never to violate
the allowed IR, at the price of a reduced UR. In addition, an upper bound on the UR of any CR strategy
operating in the presence of a Markovian PU is presented. Simulation results show a more than
116% improvement in UR at an SNR of -3 dB and an IR level of 10% with PU state estimation. Thus, this
opportunistic CR mechanism has high potential in practical scenarios in which no
information about the true states of the PU is available.



I. INTRODUCTION 

The limited availability of radio spectrum, together with the ever-increasing demand for
data rates, has created a big challenge for spectrum regulators, manufacturers and operators as
they need to meet the demand. Modulation and coding are approaching the Shannon limit,
which makes higher spectral efficiencies theoretically impossible [1]. On the other hand,
hardware impairments, including but not limited to power amplifier nonlinearities, analog-to-digital
conversion issues and phase noise, limit the efficient use of frequency bands. Although
the usable spectrum is limited, Federal Communications Commission (FCC) studies have shown
that the spectrum is severely underutilized [2]. More specifically, studies have shown that the
utilization of the spectrum varies significantly between geographical areas. For example,
fading in primary wireless channels creates spatial spectrum holes which can be exploited by
secondary users [3], BUl. Software defined radio is an enabling technology for
dynamic spectrum access 01,1151, which motivates the reuse of the unused spectrum. The
concept of cognitive radio (CR), as first defined by J. Mitola [5], entails that communication
devices adapt themselves to the spectrum.

In the context of CR, spectrum sensing plays a crucial role in the cognition phase. Since
spectrum sensing is affected by the type of signal detector, e.g., energy detectors, matched
filter detectors, cyclostationary feature detectors, wavelet feature detectors, etc., the
performance of a CR is normally measured by the performance of its spectrum sensor.
Usually, detectors and spectrum sensing algorithms are characterized by their probabilities of
mis-detection and false alarm [7] |l6l. However, the obvious choice of using these probabilities
might not be the best choice to serve the purpose of cognition and adaptation of CRs. These two
probabilities carry information only about the detector and not about the interaction between the primary
user of the band and the CR transmission strategy. Some researchers have approached the performance
evaluation of CRs from the capacity point of view (HI, which is valid with a sophisticated
channel code and a large block length (delay). Thus, a need emerges for proper measures for evaluating
the performance of cognitive radios (networks).

In traditional implementations of CR, in which only the currently sensed received signal is
considered for the transmission decision in the succeeding time slots, the important fact that the
PU traffic might follow a certain model is ignored. The CR also expects that its observation
resembles the true transmission state of the PU, and that the PU will not change its state during the period of
CR transmission. Clearly, since such a CR does not incorporate the PU transmission model in its
decision, its performance will improve if the CR decision algorithm includes such a
model. This requires a beyond-PHY or cross-layer design. Thus, integrating the PU model
into the CR transmission strategy will enable the CR to make credible predictions of PU states.

November 2, 2011 DRAFT

In the information theory literature, it is normally assumed that the CR(s) have non-causal information
about PU activities through a process called a genie BUl. However, in practical applications
this assumption does not hold. Many researchers use only the current state of the PU for the transmission
decision in the slot.

In addition, CRs suffer from other problems. The capability of CRs using energy-detection
spectrum sensing is limited by the SNR wall [10]. This is due to the low received power of
the PU signal at the CR receiver and uncertainties in signals, noise, and channels. This effect
is more visible in wideband spectrum sensing in particular [11], [12]. This can ultimately result
in large sensing delays. However, spectrum opportunities appear and disappear quickly, and
they depend on the occupancies in different bands. A real cognitive radio, which according to
the cognitive cycle [13] should adapt itself to the dynamics of the spectrum, needs to be
agile enough to react to changes in the spectrum [14] as fast as possible. On the other hand, in some
cases such as energy detectors, agility compromises the accuracy of sensing the spectrum, which
ultimately not only increases the interference caused to the PU but also reduces the spectrum
reuse. Thus, a CR which can optimally incorporate all previous observations and thereby decide
on transmission within a short time is appealing. Sequential spectrum sensing has been proven
to be faster on average than traditional energy detection [15]-[17]. However, since the detection
time varies in sequential detection, it is not a good candidate for a slotted CR strategy.

In this manuscript we deploy a hidden Markov model (HMM) to form a framework for
modeling the behavior of CRs in the presence of PUs and all the uncertainties. Additionally,
a benchmark for evaluating CR performance is introduced. Then, using this foundation and
these measures, a new CR transmission strategy is designed and implemented. This new design
ensures that the vacant spectrum is optimally used, subject to the condition that the level of interference for
the PU, which arises from the uncertainties in the model, does not exceed a certain level.

HMMs have long been used for modelling different phenomena, ranging from speech signals [18] to
the complex behavior of computer networks. In the context of cognitive radio, many researchers
model the spectrum white space with Markov models and spectrum sensing with HMMs [19]-[25].
In our paper, HMMs are used not only for spectrum sensing but also as a tool for deciding the CR
transmission strategy. The closest published approach to our method is presented in [26],
[27], which employ a partially observed Markov decision process. They used this process for
optimal policy making for multiple-channel sensing and access. The approach is similar to ours
in its Markovian assumption for the PU transmission model and in accounting for sensing
errors. However, the sensing model, performance metric, and constraints are different from ours.

To summarize, the contributions of this paper are as follows:

• A new performance measure for characterizing CR performance is introduced.

• A novel APP-LLR based opportunistic spectrum reutilization strategy is proposed.

• Optimality of this new strategy is proved.

• Two practical methods for calculating the threshold for the APP-LLR based strategy are introduced.
One is suitable for high-SNR regimes and achieves close-to-optimum URs, but its IR may be
too high at low SNR. The other never violates the allowed IR level, but at the cost of a reduced UR.

• An upper bound on the UR for any CR transmission strategy is established.

II. System model 

This section presents the model which accounts for the PU signal and noise. First, a more
general perspective is presented, and then a simplified version is adopted.

A. Complete PU transmission model 

A cognitive radio system is designed to utilize the spectrum vacancies. To take advantage of 
time-frequency slots which are not used by the PU, the CR must be aware of the PU activities. 
In this paper, it is assumed that the CR has a full buffer to reuse the spectrum whenever it is 
available. 

The CR receives the PU signal attenuated by the PU-CR channel. If there is more
than one PU in the vicinity of the CR, the aggregate signal is received by the CR antenna.
It is possible to assume that PUs that operate in the same frequency band and are co-located belong
to the same network and can thus be modelled, from the CR point of view, as a single entity.
Since protecting each PU is as important as protecting the others, a network of PUs can be
represented for the CR by a single but more active PU, although this yields a suboptimal CR
performance compared with a multi-PU model.

Another factor in modelling the PU-CR interaction is the channel in between. Wireless channels
are normally modelled as random fading processes such as Rayleigh, Rician, Nakagami,
etc. [28], [29]. For simplicity, it can be assumed that the fading gain is constant and known
during the operation of the CR. Another approach is to include the
fading in the PU transmission model: whenever the channel is in a deep fade, it is assumed
that there is no PU transmission, no matter what the real state of the PU is, and when there is no
deep fade, the standard PU transmission model is used. With this brief introduction, a
simple two-state Markov model can approximate a wide range of PU transmissions, PU network
activities and even fading channels. In the next section, the simplified two-state Markov model
is presented as the PU transmission model.

B. Simplified PU transmission model 

Now, the PU transmissions are assumed to be slotted, since in most of today's digital communication
systems transmissions are confined within a packet, frame or generally some block
structure of some minimum length $T_p$. However, the CR is expecting PU activities and vacancies
in much smaller slots of length $T \ll T_p$. A smaller slot size improves the agility of the CR in adapting its
transmission to the PU activity. For the sake of simplicity, we assume that the CR slots are
synchronized to the PU slots. However, because the CR slot length is small in comparison to the
PU slot length, mismatches in synchronization will not cause major performance degradation.
The existence of a PU transmission in slot $k$, i.e., during time $t \in [kT, (k+1)T)$, is denoted by the
hypothesis $H_1 = \{q_k = 1\}$ and its absence is denoted by $H_0 = \{q_k = 0\}$. A simple model which
represents the PU transmission is the two-state on-off Markov process depicted in Fig. 1, where
the Markov chain is represented by the transition probabilities $a_{ij} = \Pr\{q_{k+1} = j \mid q_k = i\} > 0$
for $i, j \in \{0, 1\}$ and $q_k$ stands for the PU state at time slot $k$. The transition matrix is

$$A = \begin{bmatrix} a_{00} & a_{01} \\ a_{10} & a_{11} \end{bmatrix}, \qquad a_{00} + a_{01} = a_{10} + a_{11} = 1. \tag{1}$$

Fig. 1. PU transmission model

The initial distribution of the states is assumed to be the steady-state distribution [18], defined as

$$\begin{bmatrix} \pi_0 & \pi_1 \end{bmatrix} = \begin{bmatrix} \Pr\{q_k = 0\} & \Pr\{q_k = 1\} \end{bmatrix} = \begin{bmatrix} \dfrac{a_{10}}{a_{01} + a_{10}} & \dfrac{a_{01}}{a_{01} + a_{10}} \end{bmatrix}, \qquad k = 0, 1, 2, \ldots \tag{2}$$

It is assumed that the PU activities change with a period $T_p$ that is larger than the slot length $T$
on which the CR Markov chain operates. Thus, the probability of staying in a state is much higher than
the probability of a transition between states. This allows us to assume that $a_{01} + a_{10} < 1$, which turns
out to be useful in Section I
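As a minimal sketch of the model above, the following Python fragment computes the steady-state distribution in (2) and draws a state sequence from the two-state chain. The function names, the example values $a_{01} = 0.1$, $a_{10} = 0.3$ and the random seed are illustrative assumptions, not values taken from the paper.

```python
import random

def stationary_distribution(a01, a10):
    """Return (pi0, pi1) of the two-state on-off chain, as in (2)."""
    total = a01 + a10
    return a10 / total, a01 / total

def simulate_pu_states(a01, a10, num_slots, seed=0):
    """Draw num_slots PU states from the chain, starting in steady state."""
    rng = random.Random(seed)
    _, pi1 = stationary_distribution(a01, a10)
    q = 1 if rng.random() < pi1 else 0
    states = [q]
    for _ in range(num_slots - 1):
        # Probability that the next state is 1: a01 from state 0, a11 = 1 - a10 from state 1.
        p_next_one = a01 if q == 0 else 1.0 - a10
        q = 1 if rng.random() < p_next_one else 0
        states.append(q)
    return states

# Hypothetical example values, chosen so that a01 + a10 < 1.
pi0, pi1 = stationary_distribution(0.1, 0.3)
states = simulate_pu_states(0.1, 0.3, 50_000)
activity = sum(states) / len(states)  # empirical fraction of PU-active slots
```

For a long enough sequence, the empirical activity should be close to $\pi_1$.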



C. Signal and noise model 

The receiver front end is an energy detector whose output is $y_k = \sum_{i=0}^{K-1} |r(kT + iT_s)|^2$, where
$r(\cdot)$ is the complex envelope of the received signal low-pass filtered to the PU signal bandwidth
$W$, $T$ is the period in which energy is collected, $T_s$ is the sampling time, and $K$ is the total
number of samples in each period. We assume that the received PU signal can be modelled as
a Gaussian random process. The Gaussian PU signal model is common in the literature [30] flU,
and is reasonable for many combinations of PU signal formats and channels (fading as well as
nonfading). If we select $T_s$ such that $T_s \ge 1/W$, then the samples $r(iT_s)$ are approximately
statistically independent. We note that $K$ is constrained as $K \le T/T_s$.

Since noise and channel uncertainty exist in the CR observation of the PU signal, the true
PU state from Fig. 1 is not observable. Depending on the state of the PU, a continuous energy
level which consists of noise only or signal plus noise is observed. This model corresponds to
the continuous-output HMM depicted in Fig. 2.

Fig. 2. Continuous-output HMM of received signal at CR

1) Noise only: In state $H_0$, the noise $n(iT_s) \sim \mathcal{CN}(0, \sigma_0^2)$ is a zero-mean complex circular
Gaussian ($\mathcal{CN}$ stands for complex circular Gaussian) sample with variance $\sigma_0^2$, and the received
signal is $r(iT_s) = n(iT_s)$. Thus, $y_k$ is chi-square distributed with $2K$ degrees of freedom
and Gaussian variance $\sigma_0^2/2$.

2) Signal plus noise: In state $H_1$, the noise is a zero-mean complex circular Gaussian sample
with variance $\sigma_0^2$, the signal is also zero-mean complex circular Gaussian with variance $\sigma_s^2$, and
$r(iT_s) = s(iT_s) + n(iT_s)$, $r(iT_s) \sim \mathcal{CN}(0, \sigma_1^2)$, where $\sigma_1^2 = \sigma_s^2 + \sigma_0^2$. Thus, $y_k$ is chi-square
distributed with $2K$ degrees of freedom and Gaussian variance $\sigma_1^2/2$.
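The observation model can be illustrated with a short simulation. The sketch below draws energy-detector outputs under both hypotheses and compares the sample means with $K\sigma_0^2$ and $K\sigma_1^2$, which are the means implied by the chi-square model. The helper name `energy_sample` and all numerical values are assumptions made for illustration only.

```python
import random

def energy_sample(pu_active, K, noise_var, signal_var, rng):
    """One energy-detector output y_k = sum_i |r_i|^2 over K complex Gaussian samples."""
    var = noise_var + signal_var if pu_active else noise_var
    std_per_dim = (var / 2.0) ** 0.5  # real and imaginary parts each have variance var/2
    y = 0.0
    for _ in range(K):
        re = rng.gauss(0.0, std_per_dim)
        im = rng.gauss(0.0, std_per_dim)
        y += re * re + im * im
    return y

rng = random.Random(1)
K, noise_var, signal_var = 10, 1.0, 0.5  # hypothetical values (SNR of about -3 dB)
idle = [energy_sample(False, K, noise_var, signal_var, rng) for _ in range(20_000)]
busy = [energy_sample(True, K, noise_var, signal_var, rng) for _ in range(20_000)]
mean_idle = sum(idle) / len(idle)  # should be close to K * sigma_0^2
mean_busy = sum(busy) / len(busy)  # should be close to K * sigma_1^2
```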

III. Problem statement and performance metrics 

Cognitive radios exploit channel availability information from spectrum sensing and decide
whether to transmit or not. In this paper we assume that the CR has a full buffer to transmit. Thus,
it would like to take advantage of any spectral opportunity and transmit whenever possible.
However, due to channel and noise uncertainties it will create unintentional interference for
the PU. Our goal is to design the best CR transmission strategy, denoted by $u_{k+1}$, where $u_{k+1} = 0$
and $u_{k+1} = 1$ represent no transmission and transmission, respectively, in slot $k + 1$ using the
observations until time $k$, $\mathbf{y}_k = [y_1, y_2, \ldots, y_k]^T$. This strategy must not interfere with
the PU more than a specified limit.

The performance of a CR is usually assessed based on its spectrum-sensing algorithm. Spectrum
sensing is judged based on its probability of false alarm $P_{FA}$ and probability of mis-detection
$P_M$, which are normally presented in receiver operating characteristic plots. However, the ultimate
goal of CRs is to reutilize the idle spectrum slots while keeping the level of interference for PUs
below a certain level. The two aforementioned measures do not take the PU behavior into
account. Moreover, utilization and interference are defined by the presence or absence of PU
transmission. Therefore, it is advantageous to define new criteria which consider the full picture,
including PUs, CRs, and even the channel.

A. Definitions 

Interference happens whenever the CR transmits at the same time as the PU. Thus, the
interference ratio (IR) $\rho$ is defined as [31]

$$\rho = \Pr\{u_{k+1} = 1 \mid q_{k+1} = 1\}. \tag{3}$$

Utilization of the spectrum occurs whenever the CR transmits in a vacant time-frequency slot.
Thus, we define the spectral utilization ratio (UR) as

$$\eta = \Pr\{u_{k+1} = 1 \mid q_{k+1} = 0\}. \tag{4}$$

The intention of any CR is to design a strategy that keeps $\rho$ below a specified level, say $\rho_{\max}$,
and then maximizes the utilization ratio $\eta$. Hence, we call a transmission scheme that maximizes
$\eta$ while $\rho \le \rho_{\max}$ an optimal transmission scheme for a given $a_{01}$ and $a_{10}$. The relation of UR and
IR to the transmission rate and the probability of error of the CR appeared in [31]. The following
theorem states that the UR and IR depend on the PU transition matrix $A$, $P_0 = \Pr\{u_{k+1} = 0 \mid q_k = 0\}$ and
$P_1 = \Pr\{u_{k+1} = 1 \mid q_k = 1\}$.

Theorem 1: Assume that the PU follows the Markov model presented in Fig. 1. For any CR, the
UR and the IR are given by

$$\eta = a_{01} P_1 + a_{00}(1 - P_0), \tag{5}$$
$$\rho = a_{11} P_1 + a_{10}(1 - P_0). \tag{6}$$
Proof: Proof for a similar theorem is presented in BU Th. 1]. ■ 

Remark: If we set $u_{k+1} = \bar{\hat{q}}_k$, where $\hat{q}_k$ is an estimate of the PU state $q_k$ and $\bar{\cdot}$ denotes negation,
then $P_0$ is the false-alarm probability and $P_1$ is the probability of missed detection for $\hat{q}_k$.
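Expressions (5) and (6) translate directly into code. The sketch below evaluates them for given $P_0$ and $P_1$; the function name `ur_ir` and the numerical values are illustrative assumptions.

```python
def ur_ir(a01, a10, P0, P1):
    """Utilization ratio (5) and interference ratio (6) of a CR strategy.

    P0 = Pr{u_{k+1} = 0 | q_k = 0} and P1 = Pr{u_{k+1} = 1 | q_k = 1}.
    """
    a00, a11 = 1.0 - a01, 1.0 - a10
    eta = a01 * P1 + a00 * (1.0 - P0)
    rho = a11 * P1 + a10 * (1.0 - P0)
    return eta, rho

# Hypothetical operating point of a detector-based strategy.
eta, rho = ur_ir(a01=0.1, a10=0.2, P0=0.1, P1=0.05)

# Error-free knowledge of the current state (P0 = P1 = 0) still causes interference,
# because the PU may become active in the next slot: eta = a00 and rho = a10.
eta_ideal, rho_ideal = ur_ir(a01=0.1, a10=0.2, P0=0.0, P1=0.0)
```

The second call shows why the interaction with the PU model matters: even a perfect detector of $q_k$ yields $\rho = a_{10}$ when it transmits after every idle slot.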

B. Bound for the performance of cognitive transmission strategies 
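Theorem 2: For any CR that satisfies $\rho \le \rho_{\max}$,

$$\eta \le \eta_{\max} = \begin{cases} \rho_{\max} + (1 - a_{01} - a_{10}) \min\left\{\dfrac{\rho_{\max}}{a_{10}}, \dfrac{1 - \rho_{\max}}{a_{11}}\right\}, & \text{if } a_{01} + a_{10} < 1; \\[2ex] \rho_{\max} - (1 - a_{01} - a_{10}) \min\left\{\dfrac{\rho_{\max}}{a_{11}}, \dfrac{1 - \rho_{\max}}{a_{10}}\right\}, & \text{if } a_{01} + a_{10} \ge 1. \end{cases} \tag{7}$$

Proof: Eliminating $1 - P_0$ from (5) and (6) yields

$$\eta = \rho + \frac{1 - a_{01} - a_{10}}{a_{10}} (\rho - P_1). \tag{8}$$

The feasible range of $P_1$ can be calculated from $0 \le P_0 \le 1$ and $0 \le P_1 \le 1$ as
$\max\{0, (\rho - a_{10})/a_{11}\} \le P_1 \le \min\{1, \rho/a_{11}\}$. If $a_{01} + a_{10} < 1$, then $\eta$ can be upper-bounded by substituting
the lower bound on $P_1$ and $\rho \le \rho_{\max}$ in (8), which yields the first line of (7). Similarly, if
$a_{01} + a_{10} \ge 1$, then the second line of (7) is obtained from (8) and the upper bound on $P_1$. ■

Corollary 1: $\eta_{\max} \ge \rho_{\max}$.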
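The bound of Theorem 2 is easy to evaluate numerically. The following sketch assumes the two-case form of the bound with all transition probabilities strictly positive; the function name `ur_upper_bound` and the example values are illustrative assumptions.

```python
def ur_upper_bound(a01, a10, rho_max):
    """Upper bound on the UR of any CR strategy with IR <= rho_max (Theorem 2).

    Assumes 0 < a01 < 1 and 0 < a10 < 1, so that no division by zero occurs.
    """
    a11 = 1.0 - a10
    c = 1.0 - a01 - a10
    if c >= 0.0:
        return rho_max + c * min(rho_max / a10, (1.0 - rho_max) / a11)
    return rho_max - c * min(rho_max / a11, (1.0 - rho_max) / a10)

# Slowly varying PU (a01 + a10 < 1) and a rapidly alternating PU (a01 + a10 > 1).
bound_slow = ur_upper_bound(a01=0.1, a10=0.2, rho_max=0.1)
bound_fast = ur_upper_bound(a01=0.6, a10=0.7, rho_max=0.1)
```

In both cases the bound is at least $\rho_{\max}$, in agreement with Corollary 1.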

IV. Energy detection as baseline CR strategy 

Energy detection, which is one of the most widely deployed spectrum sensing methods because
of its simplicity, compares the estimated received energy ($y_k$) with a threshold to detect the
presence or absence of the PU signal. Using this threshold at a certain signal-to-noise ratio (SNR)
of the received PU signal at the CR results in certain probabilities of mis-detection and
false alarm. This procedure is modelled in the HMM presented in Fig. 3. In this model, $\hat{q}_k = 0$
and $\hat{q}_k = 1$ denote the detected state to be $H_0$ and $H_1$, respectively, and thus $\hat{q}_k = 0$ if $y_k < \theta_E$
or $\hat{q}_k = 1$ if $y_k \ge \theta_E$, where $\theta_E$ is the detection threshold. Thus, $P_{FA}$ and $P_M$ are

$$P_{FA} = 1 - \mathcal{F}_{y_k|q_k}(\theta_E|0) = 1 - \frac{\gamma(K, \theta_E/\sigma_0^2)}{\Gamma(K)}, \tag{9}$$
$$P_M = \mathcal{F}_{y_k|q_k}(\theta_E|1) = \frac{\gamma(K, \theta_E/\sigma_1^2)}{\Gamma(K)}, \tag{10}$$

Fig. 3. HMM model for the energy detector.

where $\Gamma$ is the Gamma function, $\gamma$ is the lower incomplete Gamma function, and $\mathcal{F}_{y_k|q_k}(\cdot|0)$ and
$\mathcal{F}_{y_k|q_k}(\cdot|1)$ are the cumulative distribution functions (CDF) of a chi-square distribution with $2K$
degrees of freedom and Gaussian variance $\sigma_0^2/2$ and $\sigma_1^2/2$, respectively.

We will use $u_{k+1} = \bar{\hat{q}}_k$ as the baseline transmission strategy. The threshold $\theta_E$ that maximizes
UR is calculated by recalling that $P_0 = P_{FA}$ and $P_1 = P_M$, combining expressions (6), (9) and
(10), substituting $\rho = \rho_{\max}$ and solving for $\theta_E$.
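A possible numerical realization of this threshold calculation is sketched below. It uses the closed form of the regularized lower incomplete Gamma function for integer $K$ and solves $\rho(\theta_E) = \rho_{\max}$ by bisection, which is one of several possible root-finding choices and is not claimed to be the authors' method. All function names and parameter values are illustrative assumptions.

```python
import math

def energy_cdf(y, K, var):
    """CDF of y_k for per-sample variance var: gamma(K, y/var) / Gamma(K), integer K."""
    if y <= 0.0:
        return 0.0
    x = y / var
    term, partial_sum = 1.0, 1.0
    for m in range(1, K):
        term *= x / m
        partial_sum += term
    return 1.0 - math.exp(-x) * partial_sum

def baseline_ur_ir(theta, K, var0, var1, a01, a10):
    """UR and IR of the baseline strategy, from (5), (6), (9) and (10)."""
    p_fa = 1.0 - energy_cdf(theta, K, var0)
    p_m = energy_cdf(theta, K, var1)
    eta = a01 * p_m + (1.0 - a01) * (1.0 - p_fa)
    rho = (1.0 - a10) * p_m + a10 * (1.0 - p_fa)
    return eta, rho

def baseline_threshold(rho_max, K, var0, var1, a01, a10):
    """Largest threshold whose IR does not exceed rho_max (IR increases with theta)."""
    lo, hi = 0.0, 1.0
    while baseline_ur_ir(hi, K, var0, var1, a01, a10)[1] < rho_max:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if baseline_ur_ir(mid, K, var0, var1, a01, a10)[1] < rho_max:
            lo = mid
        else:
            hi = mid
    return lo

# Hypothetical scenario: K = 10 samples, SNR of 0 dB, rho_max = 10%.
theta_e = baseline_threshold(0.1, K=10, var0=1.0, var1=2.0, a01=0.1, a10=0.2)
eta_e, rho_e = baseline_ur_ir(theta_e, 10, 1.0, 2.0, 0.1, 0.2)
```

Bisection is applicable here because both $P_M$ and $1 - P_{FA}$ increase with $\theta_E$, so the IR in (6) is a monotone function of the threshold.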

V. A-POSTERIORI PROBABILITIES LLR BASED COGNITIVE RADIO 

One reasonable way to incorporate both the model and the entire observation is to form the
a-posteriori probability $\Pr\{q_{k+1} = 1 \mid \mathbf{y}_k\}$. This probability is used in the decision rule as

$$u_{k+1} = \begin{cases} 1, & \text{if } Z_k < \theta_{LLR} \\ 0, & \text{if } Z_k \ge \theta_{LLR} \end{cases} \tag{11}$$

where $Z_k = \log \dfrac{\Pr\{q_{k+1} = 1 \mid \mathbf{y}_k\}}{\Pr\{q_{k+1} = 0 \mid \mathbf{y}_k\}}$ and $\theta_{LLR}$ are the a-posteriori log-likelihood ratio and the threshold
for $Z_k$, respectively. $Z_k$, which is used for estimating the future state of the PU, will hereafter
be referred to as the LLR. Thus, with the same method as explained in [31, eqs. 18-19], the LLR
as a function of the forward variables $\alpha_k(j) = \Pr\{q_k = j, \mathbf{y}_k\}$, $j \in \{0, 1\}$, which are computed
recursively [18, eqs. 19-21] with moderate complexity, is derived as

$$Z_k = \log \frac{a_{01}\alpha_k(0) + a_{11}\alpha_k(1)}{a_{00}\alpha_k(0) + a_{10}\alpha_k(1)}. \tag{12}$$

In our previous paper [31] the forward variables were calculated based on the discrete-output
HMM. However, the forward variables can also be calculated based on the continuous-output HMM
presented in Fig. 2. There are several benefits in doing the latter. The baseline method in
Section IV needs a threshold to be calculated, while the continuous model does not need such
a threshold. This thresholding might reduce the information available in the samples from the
continuous-output HMM. Since both $\rho(\theta_{LLR})$ and $\eta(\theta_{LLR})$ are nondecreasing functions of $\theta_{LLR}$,
it follows that the optimum threshold, which does not cause more interference than the allowed
$\rho_{\max}$ and maximizes the UR, is found as

$$\theta_{LLR} = \mathcal{F}^{-1}_{Z_k|q_{k+1}}(\rho_{\max}|1), \tag{13}$$

where $\mathcal{F}^{-1}_{Z_k|q_{k+1}}(\cdot|1)$ is the inverse CDF of $Z_k$ conditioned on $q_{k+1} = 1$.

In the case that the PU transition matrix in (1) is time-variant, semi-Markov models can be
used instead of the model in Fig. 1. For hidden semi-Markov models, forward variables can be
calculated [32] and thus the same method can be deployed.
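The LLR in (12) can be computed with a normalized version of the forward recursion, as sketched below. Normalizing the forward variables at each step does not change the ratio in (12) and avoids numerical underflow for long observation sequences; this normalization, the function names and the example energies are assumptions made for illustration.

```python
import math

def log_pdf_energy(y, K, var):
    """Log-density of the energy y_k > 0 for per-sample variance var."""
    return (K - 1) * math.log(y) - y / var - K * math.log(var) - math.lgamma(K)

def llr_sequence(ys, a01, a10, K, var0, var1):
    """A-posteriori LLRs Z_1, ..., Z_n of (12) for the continuous-output HMM."""
    a00, a11 = 1.0 - a01, 1.0 - a10
    # Steady-state prior (2); p0, p1 hold Pr{q_k = j | y_1, ..., y_{k-1}}.
    p0, p1 = a10 / (a01 + a10), a01 / (a01 + a10)
    llrs = []
    for y in ys:
        l0 = log_pdf_energy(y, K, var0)
        l1 = log_pdf_energy(y, K, var1)
        m = max(l0, l1)  # subtract the maximum before exponentiating
        w0 = p0 * math.exp(l0 - m)
        w1 = p1 * math.exp(l1 - m)
        f0, f1 = w0 / (w0 + w1), w1 / (w0 + w1)  # proportional to alpha_k(0), alpha_k(1)
        n0 = a00 * f0 + a10 * f1  # Pr{q_{k+1} = 0 | y_1, ..., y_k}
        n1 = a01 * f0 + a11 * f1  # Pr{q_{k+1} = 1 | y_1, ..., y_k}
        llrs.append(math.log(n1 / n0))
        p0, p1 = n0, n1
    return llrs

def decide(llrs, theta_llr):
    """Decision rule (11): transmit in slot k+1 if Z_k < theta_llr."""
    return [1 if z < theta_llr else 0 for z in llrs]

# Hypothetical energies: two low readings followed by two high readings.
llrs = llr_sequence([5.0, 6.0, 30.0, 28.0], a01=0.1, a10=0.2, K=10, var0=1.0, var1=2.0)
decisions = decide(llrs, theta_llr=0.0)
```

Note that for $a_{01} + a_{10} < 1$ the LLR is confined to the interval $[\log(a_{01}/a_{00}), \log(a_{11}/a_{10})]$, since the predicted probability of an active PU can never leave $[a_{01}, a_{11}]$.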

A. Optimality of the LLR based cognitive radio 

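Theorem 3: The a-posteriori LLR-based cognitive transmission scheme presented in (11) is
the optimum strategy in terms of maximizing UR subject to $\rho \le \rho_{\max}$.

Proof: The proof is inspired by the proof of the Neyman-Pearson lemma [33]. To prove
the theorem, it should be shown that for any other strategy A, which has $\eta_A$ and $\rho_A \le \rho_{\max}$, the
LLR-based strategy has a higher UR, $\eta_{LLR} \ge \eta_A$, under the condition $\rho_{LLR} = \rho_{\max}$. The set of
observations for which the CR decides to transmit is denoted by $R$. Thus, for the LLR-based
strategy the set $R_{LLR}$ is defined as

$$R_{LLR} = \left\{ \mathbf{y} : \frac{f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|1)}{f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|0)} < \tilde{\theta}_{LLR} \right\},$$

where $f_{\mathbf{Y}_k|q_{k+1}}$ is the distribution of the observations given the next PU state and $\tilde{\theta}_{LLR}$ is the threshold
on the likelihood ratio that corresponds to $\theta_{LLR}$ in (11). The IR and UR can be
written as

$$\rho = \Pr\{\mathbf{Y}_k \in R \mid q_{k+1} = 1\} = \int_R f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|1)\, d\mathbf{y}, \qquad \eta = \Pr\{\mathbf{Y}_k \in R \mid q_{k+1} = 0\} = \int_R f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|0)\, d\mathbf{y}. \tag{14}$$

From the law of total probability it can be shown that

$$R_A = (R_A \cap R_{LLR}) \cup (R_A \cap R^c_{LLR}), \qquad R_{LLR} = (R_A \cap R_{LLR}) \cup (R^c_A \cap R_{LLR}), \tag{15}$$

where $R^c$ denotes the complement set of $R$. Since the components of the union are disjoint events,
the probability that an observation belongs to a set can be written as the sum of the components.
Thus, to show that $\eta_{LLR} \ge \eta_A$, it is enough to show that $\Pr\{\mathbf{Y}_k \in R^c_A \cap R_{LLR} \mid q_{k+1} = 0\} \ge
\Pr\{\mathbf{Y}_k \in R_A \cap R^c_{LLR} \mid q_{k+1} = 0\}$. Starting from the left side, we can
write

$$\begin{aligned}
\Pr\{\mathbf{Y}_k \in R^c_A \cap R_{LLR} \mid q_{k+1} = 0\} &= \int_{R^c_A \cap R_{LLR}} f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|0)\, d\mathbf{y} \ge \frac{1}{\tilde{\theta}_{LLR}} \int_{R^c_A \cap R_{LLR}} f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|1)\, d\mathbf{y} \\
&= \frac{1}{\tilde{\theta}_{LLR}} \Pr\{\mathbf{Y}_k \in R^c_A \cap R_{LLR} \mid q_{k+1} = 1\} = \frac{\rho_{LLR} - \rho'}{\tilde{\theta}_{LLR}} = \frac{\rho_{\max} - \rho'}{\tilde{\theta}_{LLR}} \ge \frac{\rho_A - \rho'}{\tilde{\theta}_{LLR}} \\
&= \frac{1}{\tilde{\theta}_{LLR}} \Pr\{\mathbf{Y}_k \in R_A \cap R^c_{LLR} \mid q_{k+1} = 1\} = \frac{1}{\tilde{\theta}_{LLR}} \int_{R_A \cap R^c_{LLR}} f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|1)\, d\mathbf{y} \\
&\ge \int_{R_A \cap R^c_{LLR}} f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|0)\, d\mathbf{y} = \Pr\{\mathbf{Y}_k \in R_A \cap R^c_{LLR} \mid q_{k+1} = 0\},
\end{aligned} \tag{16}$$

where $\rho' = \Pr\{\mathbf{Y}_k \in R_A \cap R_{LLR} \mid q_{k+1} = 1\}$. The last inequality in (16) is true since

$$\mathbf{y} \in R_A \cap R^c_{LLR} \Rightarrow \mathbf{y} \in R^c_{LLR} \Rightarrow \frac{f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|1)}{f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|0)} \ge \tilde{\theta}_{LLR} \Rightarrow \frac{1}{\tilde{\theta}_{LLR}} f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|1) \ge f_{\mathbf{Y}_k|q_{k+1}}(\mathbf{y}|0).$$

B. Implementation issues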

In this section, the limiting assumptions for using the LLR-based method presented earlier
are discussed. By looking carefully at the requirements of the LLR-based method, it is apparent
that calculating the LLRs requires knowledge of the hidden Markov model. For both
the discrete-output and the continuous-output HMM, the transition matrix $A$ and the SNR are required.
This paper assumes that this information is available or estimated beforehand. In [18, sec. III-C]
the Baum-Welch iterative estimation algorithm, which is equivalent to the well-established
expectation-maximization (EM) method, is demonstrated. This method will be used to estimate
the model parameters from the observations. While examining the performance of the Baum-Welch
algorithm is beyond the scope of this paper, there exists a vast amount of literature on
its convergence and performance.

The second and more challenging issue in the LLR-based method lies in the calculation of
the threshold in expression (13). This expression requires knowledge of the PU
states (or their estimates) for a certain training period in order to estimate $\mathcal{F}_{Z_k|q_{k+1}}(x|1)$. This is normally
done sporadically, but since the true states of the PU are not known, they have to be estimated.
This can be done for the previous observations: their corresponding PU states can be
estimated with the forward-backward algorithm [18]. Notice that the estimated states of the PU
might not perfectly correspond to the actual ones due to the uncertainties in the noise and
channel. This changes the empirical CDF and thus the threshold calculated from it.
The error in the PU state estimation depends strongly on the SNR and also on the
matrix $A$. The main concern with this error is that it might result in a violation of the
maximum allowed IR for the PU ($\rho_{\max}$). However, to have a useful method that is robust to changes
and reductions in SNR, it is necessary to make sure that it never violates the IR constraint under any
conditions. At low SNRs, at which the PU state estimation might be poor, we can directly use the
unconditional empirical CDF of the LLRs, which does not need PU state estimation. In Section VI
we prove analytically that the threshold calculated from the unconditional CDF of
the LLRs results in a CR strategy which does not violate the IR constraint.

VI. Threshold calculation without true PU state knowledge

The threshold for the CR transmission strategy can be calculated based on expression (13).
To do so, the actual PU states are needed to estimate the empirical CDF (ECDF) of the LLRs
conditioned on the PU states. This empirical CDF is used for calculating the decision threshold.
In this paper, we estimate the PU states with the forward-backward algorithm. Notice that the
scenario where the correct PU states are known is not realistic.

In this section, we show that, even without knowing the true state of the PU, it is possible
to find a threshold that will not harm the PU. To prove the existence of such a threshold, it is
sufficient to prove that if the threshold is calculated based on the unconditional empirical CDF,
the actual IR will not exceed $\rho_{\max}$. This can be shown by proving that the unconditional CDF of
the LLRs ($\mathcal{F}_{Z_k}(x)$) is always at least as large as the CDF of the LLRs conditioned on the next PU state being
one ($\forall x$: $\mathcal{F}_{Z_k}(x) \ge \mathcal{F}_{Z_k|q_{k+1}}(x|1)$). This is proved in Theorem 4. As explained in Section II-B, we
focus on the case $a_{01} + a_{10} < 1$. The main part of this proof is to show that the CDF
of the LLRs conditioned on the next PU state being zero is always at least as large as the CDF of the
LLRs conditioned on the next PU state being one ($\forall x$: $\mathcal{F}_{Z_k|q_{k+1}}(x|0) \ge \mathcal{F}_{Z_k|q_{k+1}}(x|1)$), which is
proved in the same theorem. To show this, first it is shown that $\mathcal{F}_{Z_k|q_{k+1}}(x|0) \ge \mathcal{F}_{Z_k|q_{k+1}}(x|1)$
is equivalent to $\mathcal{F}_{\Lambda_k|q_{k+1}}(x|0) \ge \mathcal{F}_{\Lambda_k|q_{k+1}}(x|1)$, where $\Lambda_k = \log \dfrac{\alpha_k(1)}{\alpha_k(0)}$. Now, by inserting
the expression for calculating the forward variable [18, eqs. 19-20], the following expression is
obtained:

$$\Lambda_k = \log \frac{\left(\alpha_{k-1}(0)a_{01} + \alpha_{k-1}(1)a_{11}\right) b_1(y_k)}{\left(\alpha_{k-1}(0)a_{00} + \alpha_{k-1}(1)a_{10}\right) b_0(y_k)} = Z_{k-1} + B_k, \quad \text{if } k > 1, \tag{17}$$

and $\Lambda_1 = \log(\pi_1/\pi_0) + B_1$ for the steady-state initial distribution (2), where $B_k = \log \dfrac{b_1(y_k)}{b_0(y_k)}$ and $b_i(\cdot)$ is the probability density function of a chi-square random variable with $2K$
degrees of freedom and original Gaussian variance $\sigma_i^2/2$. Recall that $\sigma_0^2$ is the noise variance
and $\sigma_1^2$ is the signal-plus-noise variance, $\sigma_1^2 = \sigma_s^2 + \sigma_0^2$.

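Lemma 1: If $\mathcal{F}_{\Lambda_k|q_{k+1}}(x|0) \ge \mathcal{F}_{\Lambda_k|q_{k+1}}(x|1)$, $\forall x \in \mathbb{R}$, and $a_{01} + a_{10} < 1$, then $\mathcal{F}_{Z_k|q_{k+1}}(x|0) \ge
\mathcal{F}_{Z_k|q_{k+1}}(x|1)$ for all $x$ in the domain of $Z_k$.

Proof: From (12), we have

$$Z_k = \log \frac{a_{01} + a_{11}\frac{\alpha_k(1)}{\alpha_k(0)}}{a_{00} + a_{10}\frac{\alpha_k(1)}{\alpha_k(0)}} = \log \frac{a_{01} + a_{11}e^{\Lambda_k}}{a_{00} + a_{10}e^{\Lambda_k}} = \log\left( \frac{a_{11}}{a_{10}} - \frac{1 - a_{01} - a_{10}}{a_{10}\left(a_{00} + a_{10}e^{\Lambda_k}\right)} \right). \tag{18}$$

Since $1 - a_{01} - a_{10} > 0$, in (18) the second term inside the log has a positive numerator and
denominator, and the exponential is an increasing function of $\Lambda_k$. Thus, $Z_k$ is a monotonically increasing
function of $\Lambda_k$. The lemma follows since the CDFs of $\Lambda_k$ and $Z_k$ will have the same behaviour.

■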
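The algebra in (18) and the monotonicity used in the proof of Lemma 1 can be checked numerically. The sketch below evaluates both forms of $Z_k$ on a grid of $\Lambda_k$ values; the function name and the transition probabilities are illustrative assumptions.

```python
import math

def llr_from_lambda(lam, a01, a10):
    """Return Z_k computed from Lambda_k with both forms of (18)."""
    a00, a11 = 1.0 - a01, 1.0 - a10
    e = math.exp(lam)
    direct = math.log((a01 + a11 * e) / (a00 + a10 * e))
    rewritten = math.log(a11 / a10 - (1.0 - a01 - a10) / (a10 * (a00 + a10 * e)))
    return direct, rewritten

# Hypothetical chain with a01 + a10 < 1, evaluated for Lambda_k in [-10, 10].
grid = [x / 10.0 for x in range(-100, 101)]
values = [llr_from_lambda(lam, a01=0.1, a10=0.2) for lam in grid]
forms_agree = all(abs(d - r) < 1e-9 for d, r in values)
is_increasing = all(values[i][0] < values[i + 1][0] for i in range(len(values) - 1))
```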

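Lemma 2: For $y_k$ as defined in Section II-C and $B_k$ defined in (17), $\mathcal{F}_{B_k|q_k}(x|0) \ge \mathcal{F}_{B_k|q_k}(x|1)$
for all $k \ge 1$ and $x > 0$.

Proof: Starting from the derivation of $B_k$, we have [34, p. 370]

$$B_k = \log \frac{b_1(y_k)}{b_0(y_k)} = \log \frac{ y_k^{K-1} e^{-y_k/\sigma_1^2} \big/ \left(\sigma_1^{2K}\, \Gamma(K)\right) }{ y_k^{K-1} e^{-y_k/\sigma_0^2} \big/ \left(\sigma_0^{2K}\, \Gamma(K)\right) } = 2K \log \frac{\sigma_0}{\sigma_1} + y_k \left( \frac{1}{\sigma_0^2} - \frac{1}{\sigma_1^2} \right),$$

where $\Gamma(\cdot)$ represents the Gamma function. Because $\sigma_1^2 > \sigma_0^2$, $B_k$ is a strictly increasing function
of $y_k$. The lemma now follows because

$$\mathcal{F}_{y_k|q_k}(y|1) = \frac{\gamma(K, y/\sigma_1^2)}{\Gamma(K)} \le \frac{\gamma(K, y/\sigma_0^2)}{\Gamma(K)} = \mathcal{F}_{y_k|q_k}(y|0).$$

■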

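Lemma 3: Let $C_k$ be any stationary random process that, conditioned on $q_k$, is independent of
$q_{k+1}$. If $a_{01} + a_{10} < 1$, then for any $x$

$$\mathcal{F}_{C_k|q_k}(x|0) \ge \mathcal{F}_{C_k|q_k}(x|1) \Rightarrow \mathcal{F}_{C_k|q_{k+1}}(x|0) \ge \mathcal{F}_{C_k|q_{k+1}}(x|1). \tag{19}$$

Proof: From the conditional independence in the lemma assumption we have $\Pr\{C_k \le
x \mid q_k = i, q_{k+1} = j\} = \Pr\{C_k \le x \mid q_k = i\}$. Now, for $i \in \{0, 1\}$ and $j \in \{0, 1\}$,

$$\Pr\{C_k \le x, q_k = i, q_{k+1} = j\} = \Pr\{C_k \le x \mid q_k = i, q_{k+1} = j\} \Pr\{q_k = i, q_{k+1} = j\}$$
$$= \Pr\{C_k \le x \mid q_k = i\} \Pr\{q_{k+1} = j \mid q_k = i\} \Pr\{q_k = i\} = \mathcal{F}_{C_k|q_k}(x|i)\, a_{ij}\, \pi_i.$$

Summing over $i$ and dividing by $\pi_j$ gives

$$\Pr\{C_k \le x \mid q_{k+1} = j\} = \frac{1}{\pi_j} \sum_{i \in \{0,1\}} p_i\, a_{ij}\, \pi_i,$$

that is, $\mathcal{F}_{C_k|q_{k+1}}(x|0) = a_{00}p_0 + a_{01}p_1$ and $\mathcal{F}_{C_k|q_{k+1}}(x|1) = a_{10}p_0 + a_{11}p_1$ by (2),
where $p_0 = \mathcal{F}_{C_k|q_k}(x|0)$ and $p_1 = \mathcal{F}_{C_k|q_k}(x|1)$. Now since, by assumption, $1 - a_{10} - a_{01} =
a_{11} - a_{01} = a_{00} - a_{10} > 0$, we have that

$$\mathcal{F}_{C_k|q_k}(x|0) \ge \mathcal{F}_{C_k|q_k}(x|1) \Rightarrow p_0(a_{00} - a_{10}) \ge p_1(a_{11} - a_{01}) \Rightarrow \mathcal{F}_{C_k|q_{k+1}}(x|0) \ge \mathcal{F}_{C_k|q_{k+1}}(x|1).$$

■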

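In Lemma 2 it was proved that $\mathcal{F}_{B_k|q_k}(x|0) \ge \mathcal{F}_{B_k|q_k}(x|1)$, and $B_k$ conditioned on $q_k$ is
independent of $q_{k+1}$, which yields the following corollary.

Corollary 2: If $a_{01} + a_{10} < 1$ then $\mathcal{F}_{B_k|q_{k+1}}(x|0) \ge \mathcal{F}_{B_k|q_{k+1}}(x|1)$.

Lemma 4: If $\mathcal{F}_{\Lambda_k|q_{k+1}}(x|0) \ge \mathcal{F}_{\Lambda_k|q_{k+1}}(x|1)$ and $a_{01} + a_{10} < 1$ then $\mathcal{F}_{Z_k|q_k}(x|0) \ge \mathcal{F}_{Z_k|q_k}(x|1)$.

Proof: From the assumptions made in this lemma and Lemma 1, $\mathcal{F}_{Z_k|q_{k+1}}(x|0) \ge \mathcal{F}_{Z_k|q_{k+1}}(x|1)$.
Now, since $Z_k$ fulfils the properties specified for $C_k$ in Lemma 3, this lemma follows. ■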
Lemma 5: If j;^|gj^(x|0) > J^z^\q^{x\l) and aoi + aio < 1 then 

^z^+Bk+,\q^+M^) >^z^+B^ + ,\q^+Ml)- (20) 

Proof: Starting from Lemma[2lwe will have J^Bk+^\qk+i{A^) ^ ^Bk+i\qk+iiA^)- Since the 
states qk form a Markov chain, the dependences between Zk, Bk+i, and qk+2 are depicted as 

• • ■ ^ Qk ^ Qk+i qk+2 . 



Thus, using the chain rule and Markov property, the joint distribution can be written as |I351 
pp. 37-38] 

Y'i{zk + Bk+i < X, qk, qk+i, qk+2} = Pr{gfc} PT{qk+i\qk} PT{qk+2\qk+i} Pr{zk + Bk+i < x\qk, qk+i}- 

(21) 

On the other hand, the CDF of the sum of two independent r.vs A and B can be expressed 
as (Ml pp. 187-190] 

J^A+Bix) = J^a{x) * fsix) = J^b{x) * fA{x), (22) 
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where /a(-) is the PDF of A and * denotes convolution. 

Since Zf^ depends only on and the previous states (and channel noise which is independent 
of the PU states) and Bk+i depends solely on g^+i (and noise), the sum of them conditioned on 
Qk, Qk+i can be written as 

(23) 

To derive both sides of the inequality (|20l) . one should marginalize the joint distribution in 
(I2TI) with respect to qk and g^+i and divide it to the = ^ {0, 1}. After doing that 

and plugging (|23] ) in (|2T]) . for the l.h.s and r.h.s of (|20l) we will have, respectively 

J^Zk+Bk+i\qk+2{A^) = (^lo-^'o * ^0 + aoiaioA^ * Bi + aoiaooA[ * Bq + aoianA[ * Bi, (24) 
j;^+Bfe+i|gfc+2(^|l) = «ioaoo-4.o * Bo + anaioAo * Bi + aioaoi^'i * Bo + q{^X^ * Si, (25) 

where Ai = J^,^\g^{x\i), Bi = J^B,+,\q^+Ax\i), A = fz^lgd^li) and B'i = fB,+,\q,+Ax\i)- 

By multiplying both sides of $A_0 \ge A_1$ with the positive value $1 - a_{01} - a_{10}$ and rearranging, we obtain $a_{00}A_0 + a_{01}A_1 \ge a_{10}A_0 + a_{11}A_1$. Convolving both sides of this inequality with the nonnegative function $B'_0$ gives $(a_{00}A_0 + a_{01}A_1) * B'_0 \ge (a_{10}A_0 + a_{11}A_1) * B'_0$. From (22), this can be rewritten as $(a_{00}A'_0 + a_{01}A'_1) * B_0 \ge (a_{10}A'_0 + a_{11}A'_1) * B_0 \ge (a_{10}A'_0 + a_{11}A'_1) * B_1$, where the last inequality follows because $a_{10}A'_0 + a_{11}A'_1 \ge 0$ and $B_0 \ge B_1$ from Lemma 2. Finally, multiplying both sides of the resulting inequality by the positive value $1 - a_{01} - a_{10}$, which equals both $a_{00} - a_{10}$ and $a_{11} - a_{01}$, we get

$$\begin{aligned}
&(a_{00}A'_0 + a_{01}A'_1)(a_{00} - a_{10}) * B_0 \ge (a_{10}A'_0 + a_{11}A'_1)(a_{11} - a_{01}) * B_1 \\
\Rightarrow\; & a_{00}^2 A'_0 * B_0 + a_{01}a_{10} A'_0 * B_1 + a_{01}a_{00} A'_1 * B_0 + a_{01}a_{11} A'_1 * B_1 \\
&\ge a_{10}a_{00} A'_0 * B_0 + a_{11}a_{10} A'_0 * B_1 + a_{10}a_{01} A'_1 * B_0 + a_{11}^2 A'_1 * B_1 \\
\Rightarrow\; & \mathcal{F}_{z_k+B_{k+1}|q_{k+2}}(x|0) \ge \mathcal{F}_{z_k+B_{k+1}|q_{k+2}}(x|1),
\end{aligned}$$

where the last step follows from (24) and (25). ■
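The coefficients in (24) and (25) can be checked numerically. The following is a minimal sketch, assuming that the PU chain is in steady state so that $\Pr\{q_k = i\}$ equals the stationary probability; the function name `conditional_pair_probs` and the example values of $a_{01}$ and $a_{10}$ are chosen here for illustration only.

```python
import itertools

def conditional_pair_probs(a01, a10):
    """Return the transition matrix and Pr{q_k=i, q_{k+1}=j | q_{k+2}=l}
    for a two-state chain that is assumed to be in steady state."""
    a = [[1 - a01, a01], [a10, 1 - a10]]          # a[i][j] = Pr{next = j | current = i}
    pi = [a10 / (a01 + a10), a01 / (a01 + a10)]   # stationary distribution (assumption)
    probs = {}
    for i, j, l in itertools.product((0, 1), repeat=3):
        joint = pi[i] * a[i][j] * a[j][l]         # chain rule and Markov property, as in (21)
        probs[(i, j, l)] = joint / pi[l]          # divide by Pr{q_{k+2} = l}
    return a, probs

a, p = conditional_pair_probs(a01=0.1, a10=0.01)

# Coefficients as they appear in (24) and (25), keyed by (i, j).
coef_24 = {(0, 0): a[0][0] ** 2, (0, 1): a[0][1] * a[1][0],
           (1, 0): a[0][1] * a[0][0], (1, 1): a[0][1] * a[1][1]}
coef_25 = {(0, 0): a[1][0] * a[0][0], (0, 1): a[1][1] * a[1][0],
           (1, 0): a[1][0] * a[0][1], (1, 1): a[1][1] ** 2}

for (i, j) in coef_24:
    assert abs(p[(i, j, 0)] - coef_24[(i, j)]) < 1e-12
    assert abs(p[(i, j, 1)] - coef_25[(i, j)]) < 1e-12
```

The check relies on the steady-state balance $\pi_0 a_{01} = \pi_1 a_{10}$, which is what turns ratios such as $\pi_1 a_{10} a_{00} / \pi_0$ into the product $a_{01} a_{00}$ seen in (24).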

Theorem 4: If $\theta' = \mathcal{F}_{z_k}^{-1}(\rho_{\max})$ and $a_{01} + a_{10} < 1$, then $\mathcal{F}_{z_k|q_{k+1}}(\theta'|1) \le \rho_{\max}$.
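Proof: From Lemma 1, proving $\mathcal{F}_{z_k|q_{k+1}}(x|0) \ge \mathcal{F}_{z_k|q_{k+1}}(x|1)$ is the same as proving that $\mathcal{F}_{\Lambda_k|q_{k+1}}(x|0) \ge \mathcal{F}_{\Lambda_k|q_{k+1}}(x|1)$. To do so, induction is used. First, $\mathcal{F}_{\Lambda_1|q_2}(x|0) \ge \mathcal{F}_{\Lambda_1|q_2}(x|1)$ for all $x$ by (11) and Corollary 2. Second, Lemma 4 and Lemma 5 show that if $\mathcal{F}_{\Lambda_k|q_{k+1}}(x|0) \ge \mathcal{F}_{\Lambda_k|q_{k+1}}(x|1)$ for any $k \ge 1$ and any $x$ then $\mathcal{F}_{\Lambda_{k+1}|q_{k+2}}(x|0) \ge \mathcal{F}_{\Lambda_{k+1}|q_{k+2}}(x|1)$, which completes the induction. Hence $\mathcal{F}_{z_k|q_{k+1}}(x|0) \ge \mathcal{F}_{z_k|q_{k+1}}(x|1)$ for any $k \ge 1$ and any $x$. Now from the assumption about $\rho_{\max}$,

$$\rho_{\max} = \mathcal{F}_{z_k}(\theta') = \pi_0 \mathcal{F}_{z_k|q_{k+1}}(\theta'|0) + \pi_1 \mathcal{F}_{z_k|q_{k+1}}(\theta'|1) \ge (\pi_0 + \pi_1)\, \mathcal{F}_{z_k|q_{k+1}}(\theta'|1) = \mathcal{F}_{z_k|q_{k+1}}(\theta'|1).$$

■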

Corollary 3: If $a_{01} + a_{10} < 1$ then for the LLR-based CR strategy $\eta \ge \rho$.

Thus, the CR strategy with a threshold found based on the unconditional CDF of all LLRs protects the PU ($\rho \le \rho_{\max}$). One assumption made in most of the lemmas and in the theorem of this section is the requirement that $a_{01} + a_{10} < 1$. Since in the system model we assumed that the CR slot length is much smaller than the PU slot length, the probability of transition from one state to another will be small. Thus, having $a_{01} + a_{10} < 1$ is not a heavy assumption and can be realized easily in practice.
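The statement of Theorem 4 can be illustrated on a toy model. The sketch below is not the simulation of this paper: it assumes, purely for illustration, that the LLR is Gaussian with mean $-1$ when the PU is idle and mean $+1$ when it is active, which satisfies the ordering of the conditional CDFs. The helper names and the bisection search are likewise assumptions made here.

```python
from statistics import NormalDist

def mixture_quantile(cdf0, cdf1, pi0, pi1, level, lo=-50.0, hi=50.0):
    """Bisection for theta with pi0*cdf0(theta) + pi1*cdf1(theta) = level."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if pi0 * cdf0(mid) + pi1 * cdf1(mid) < level:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ir_ur_at_unconditional_threshold(a01, a10, rho_max):
    """Return (IR, UR) when the threshold is set from the unconditional CDF."""
    pi0 = a10 / (a01 + a10)                  # stationary probability of the idle state
    pi1 = a01 / (a01 + a10)                  # stationary probability of the active state
    idle = NormalDist(mu=-1.0, sigma=1.0)    # hypothetical LLR law given q_{k+1} = 0
    active = NormalDist(mu=1.0, sigma=1.0)   # hypothetical LLR law given q_{k+1} = 1
    theta = mixture_quantile(idle.cdf, active.cdf, pi0, pi1, rho_max)
    return active.cdf(theta), idle.cdf(theta)

for a01, a10 in [(0.1, 0.01), (0.01, 0.1)]:
    ir, ur = ir_ur_at_unconditional_threshold(a01, a10, rho_max=0.1)
    print(f"a01={a01}, a10={a10}: IR={ir:.4f}, UR={ur:.4f}")
```

In both parameter sets the resulting IR stays at or below $\rho_{\max}$ while the UR stays at or above it, as Theorem 4 and Corollary 3 predict for any pair of conditional CDFs with this ordering.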

VII. Performance evaluation and results 

We compare the LLR-based strategy, with three different methods for calculating the threshold, against the classical energy-detection-based spectrum sensing described in Section IV. For all of these simulations, the same PU Markov model ($A$) and the same level of interference $\rho_{\max}$ are used.

The threshold needed for the LLR method is calculated by replacing $\mathcal{F}_{z_k|q_{k+1}}(x|1)$ in (13) with an empirical (sample) CDF. The empirical CDF (ECDF) is computed from the set of training data $\mathcal{Z}_t = \{z_1, z_2, \ldots, z_{N_t}\}$, where $N_t$ is assumed to be large enough such that the empirical CDF is a close approximation of the corresponding CDF. In this paper, we compute the empirical CDF from one of the following three subsets of $\mathcal{Z}_t$:

(i) $\{z_k \in \mathcal{Z}_t : q_{k+1} = 1\}$, i.e., when the PU states are assumed to be known

(ii) $\{z_k \in \mathcal{Z}_t : \hat{q}_{k+1} = 1\}$, where $\hat{q}_{k+1}$ is the PU state estimated with the forward-backward method

(iii) $\mathcal{Z}_t$, i.e., the ECDF is a close estimate of the unconditional CDF of $z_k$

Note that method (i) is unrealistic, while (ii) and (iii) are more practical for calculating the threshold. The rest of this section discusses the evaluation setup by which these CRs are assessed. It then presents some results and a comparison.
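The three threshold computations can be sketched as follows. This is a minimal illustration, assuming that the CR transmits whenever $z_k \le \theta$ and that the LLR samples are continuous valued (no ties); the function names and the dictionary keys are hypothetical and not taken from the simulation code of this paper.

```python
import numpy as np

def ecdf_threshold(samples, rho_max):
    """Largest sample theta whose empirical CDF does not exceed rho_max.

    Assumes distinct sample values; returns -inf when no sample can be
    admitted, which includes the case of an empty training subset.
    """
    z = np.sort(np.asarray(samples, dtype=float))
    n_allowed = int(np.floor(rho_max * z.size))   # samples allowed at or below theta
    if n_allowed == 0:
        return -np.inf
    return z[n_allowed - 1]

def thresholds(z_train, q_next_true, q_next_est, rho_max):
    """Thresholds for methods (i), (ii) and (iii) from one training set."""
    z_train = np.asarray(z_train, dtype=float)
    return {
        "i": ecdf_threshold(z_train[np.asarray(q_next_true) == 1], rho_max),
        "ii": ecdf_threshold(z_train[np.asarray(q_next_est) == 1], rho_max),
        "iii": ecdf_threshold(z_train, rho_max),
    }
```

Note that method (ii) degenerates to a threshold of $-\infty$ in this sketch when the state estimator never reports an active PU, since the training subset is then empty.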

A. Evaluation setup 

In simulating the performance of a CR transmission strategy, the ratio of received primary signal power (at the CR receiver) to the CR receiver noise power is important. For the sake of simplicity, we assume one PU link and one CR link. It might be possible to extend the setup to a case with multiple coordinated PUs and multiple coordinated CRs. Moreover, we define the SNR as SNR = $\sigma_s/\sigma_0$ (in dB). In this simulation, $K$ is selected to be 10; this parameter plays a role in the SNR scaling. The other factor which is important in evaluating CRs is the maximum allowable IR, $\rho_{\max}$. This parameter is normally decided by regulatory bodies like the FCC. In the simulations, $\rho_{\max}$ is chosen to be 10%. The number of elements in $\mathcal{Z}_t$ is $N_t$ = 5 · 10^^. To evaluate the performance, another 5 · 10^^ slots are simulated.
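The evaluation step can be sketched as below, assuming that the CR transmits in a slot whenever $z_k \le \theta$ and that the simulated chain starts in steady state. The function names are hypothetical, and the LLRs passed to the second function would come from whichever strategy is being assessed.

```python
import numpy as np

def simulate_pu_states(n_slots, a01, a10, rng):
    """Draw a two-state Markov chain (0 = idle, 1 = active) of length n_slots."""
    pi1 = a01 / (a01 + a10)                   # stationary probability of the active state
    q = np.empty(n_slots, dtype=int)
    q[0] = int(rng.random() < pi1)            # start in steady state (assumption)
    u = rng.random(n_slots)
    for k in range(1, n_slots):
        p_active = a01 if q[k - 1] == 0 else 1.0 - a10
        q[k] = int(u[k] < p_active)
    return q

def utilization_and_interference(z, q_next, theta):
    """Empirical UR and IR when the CR transmits whenever z_k <= theta.

    Returns NaN for a ratio whose conditioning set (idle or active slots)
    is empty.
    """
    z = np.asarray(z, dtype=float)
    q_next = np.asarray(q_next)
    transmit = z <= theta
    ur = transmit[q_next == 0].mean()         # used share of the PU-idle slots
    ir = transmit[q_next == 1].mean()         # used share of the PU-active slots
    return ur, ir
```

A typical use would draw the states with `simulate_pu_states(n, a01, a10, np.random.default_rng(seed))`, generate observations and LLRs for them, and then compare the returned IR against $\rho_{\max}$.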

B. Results 

The UR and IR of the different CRs are plotted versus SNR in Figs. 4 and 5. The thresholds for the LLR methods are computed using methods (i), (ii), and (iii) described above. For simplicity of the discussion, we assume that all ECDFs are close approximations to the corresponding CDFs. We recall that method (i) gives an optimum threshold (i.e., maximizing UR while keeping IR no larger than $\rho_{\max}$) and that method (iii) gives a threshold that guarantees that IR does not exceed $\rho_{\max}$. For method (ii), we have no guarantees for the IR.

As expected, the UR of method (i) is monotonically increasing with SNR and approaches the upper bound on the UR for high SNRs and $\rho_{\max}$ for low SNRs in both Figs. 4 and 5. In all cases, the UR of method (i) is greater than or equal to that of the baseline method. However, the UR and IR curves for methods (ii) and (iii) behave quite differently in Figs. 4 and 5. We note that one important difference between the simulation setups is that $\pi_0 < \pi_1$ in Fig. 4 and $\pi_0 > \pi_1$ in Fig. 5, and this will allow us to explain the behavior of methods (ii) and (iii).
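The stationary probabilities behind this difference follow directly from the transition probabilities given in the figure captions. A small sketch, with a helper name chosen here for illustration:

```python
def stationary(a01, a10):
    """Stationary probabilities (pi0, pi1) of a two-state Markov chain."""
    total = a01 + a10
    return a10 / total, a01 / total

# Fig. 4 setup: a01 = 0.1, a10 = 0.01, so the PU is mostly active.
print(stationary(0.1, 0.01))
# Fig. 5 setup: a01 = 0.01, a10 = 0.1, so the PU is mostly idle.
print(stationary(0.01, 0.1))
```

The larger of the two probabilities is 0.1/0.11, roughly 0.91, in both setups; it belongs to the active state in Fig. 4 and to the idle state in Fig. 5.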
Let us start with method (ii), which estimates the PU states using the forward-backward method in the training phase. In Fig. 4, the UR is very close to the optimum UR for all considered SNRs and the IR does not exceed $\rho_{\max}$. However, in Fig. 5, the performance is close to optimum only for high SNRs. For low SNRs, the IR exceeds $\rho_{\max}$, and the CR is in clear violation of the IR requirement. To explain the different low-SNR behaviors, we recall that as the SNR approaches 0 (in linear scale), the observation $y_1, \ldots, y_{N_t}$ becomes irrelevant to the PU state estimation. Indeed, as SNR $\to 0$, $\hat{q}_{k+1}$ converges in probability to 1 if $\pi_1 > \pi_0$ and to 0 if $\pi_1 < \pi_0$, for all $k = 1, 2, \ldots, N_t$. This implies that $\{z_k \in \mathcal{Z}_t : \hat{q}_{k+1} = 1\}$ converges to $\mathcal{Z}_t$ if $\pi_1 > \pi_0$ and to the empty set if $\pi_1 < \pi_0$. Hence, if $\pi_1 < \pi_0$, which is the case in Fig. 5, we expect method (ii) to completely fail as the SNR tends to 0. The numerical results in Fig. 5 further indicate that for low SNRs, method (ii) will give a too high threshold, resulting in an IR violation (we cannot estimate the IR and UR reliably for method (ii) at SNRs below $-10$ dB with this simulation length, since the training set then is empty with high probability). Conversely, if $\pi_1 > \pi_0$, method (ii) will approach method (iii) as the SNR approaches 0. This implies that for very low SNRs, method (ii) will not result in an IR violation and that the UR will be similar to that of method (iii). This reasoning is consistent with the results in Fig. 4.

We can conclude that method (ii) is close to optimum for all SNRs when $\pi_1$ is significantly larger than $\pi_0$. If $\pi_1$ is significantly smaller than $\pi_0$, then the method works close to optimum only for SNRs above a certain critical SNR. Below the critical SNR, the method leads to IR violations, and the method is therefore invalid in this regime. Continuing with method (iii), we recall that the threshold for this method, $\theta$, is such that $\mathcal{F}_{z_k}(\theta) = \rho_{\max}$ and that the unconditional CDF can be written as $\mathcal{F}_{z_k}(x) = \mathcal{F}_{z_k|q_{k+1}}(x|0)\pi_0 + \mathcal{F}_{z_k|q_{k+1}}(x|1)\pi_1$. Hence, if $\pi_1 \to 1$ then $\mathcal{F}_{z_k}(x) \to \mathcal{F}_{z_k|q_{k+1}}(x|1)$, which implies that $\rho_{\max} = \mathcal{F}_{z_k}(\theta) \approx \mathcal{F}_{z_k|q_{k+1}}(\theta|1)$. Now, since $\rho_{\max} = \mathcal{F}_{z_k|q_{k+1}}(\theta^*|1)$ is satisfied for the optimum threshold, $\theta^*$, it follows that the UR of method (iii) will be close to optimum. Now, in Fig. 4, $\pi_1 = 0.91$ and there will therefore be a gap between the UR for method (iii) and the optimum method. Conversely, if $\pi_0 \to 1$ then $\mathcal{F}_{z_k}(x) \to \mathcal{F}_{z_k|q_{k+1}}(x|0)$, which implies that $\rho_{\max} = \mathcal{F}_{z_k}(\theta) \approx \mathcal{F}_{z_k|q_{k+1}}(\theta|0) = \eta$. Hence, the UR for method (iii) tends to $\rho_{\max}$. In Fig. 5, $\pi_0 = 0.91$ and there is therefore a slight gap between the UR for method (iii) and $\rho_{\max}$. From this we conclude that method (iii) works best when $\pi_1$ is large. For the case when $\pi_0$ is large, the threshold is too conservative, resulting in a large UR penalty. However, the IR is never violated and method (iii) is the only practical method that is valid for low SNR when $\pi_0$ is large.

Fig. 4. UR (thick lines) and IR (thin lines) vs. SNR for the baseline CR and corresponding continuous HMM LLR-based CR at $\rho_{\max} = 10\%$, $a_{01} = 0.1$ and $a_{10} = 0.01$

Fig. 5. UR (thick lines) and IR (thin lines) vs. SNR for the baseline CR and corresponding continuous HMM LLR-based CR at $\rho_{\max} = 10\%$, $a_{01} = 0.01$ and $a_{10} = 0.1$

VIII. Conclusion

In this paper, we have introduced a framework that models the PU, channel, and CR receiver front-end with a simple two-state, continuous-output HMM. The CR transmission strategy can, in general, be viewed as computing a decision variable from the HMM output and comparing the decision variable with a threshold. Hence, to specify a CR transmission strategy, we need only specify how to compute the decision variable and how to set the threshold. The performance of a transmission strategy is measured by its UR, under the constraint that the IR does not exceed $\rho_{\max}$. In Theorem 2, we proved an upper bound on the UR, which is a function of the HMM model parameters and $\rho_{\max}$. Theorem 3 states that the optimum decision variable is the APP LLR $z_k$, as defined in (12). The LLRs can be computed from the forward variables,
which, in turn, can be computed with moderate complexity [[TSl. Numerical results show that 
using the LLR decision variable gives large gains compared to the baseline method, which is 
based on simple energy detection. The gains are due to the fact that the LLR method makes use
of all past observations of the PU activity and knowledge of the HMM parameters. 
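The forward computation of the LLRs can be sketched as follows. This is an illustration under stated assumptions rather than the definition in (12): it takes the LLR to be the log ratio of the one-step predicted probabilities of the active and idle states given all observations so far, and it takes the per-slot observation likelihoods as given inputs. The function name `predictive_llrs` and its arguments are hypothetical.

```python
import math

def predictive_llrs(lik, a01, a10, pi1):
    """LLRs log(Pr{q_{k+1}=1 | y_1..y_k} / Pr{q_{k+1}=0 | y_1..y_k}).

    lik is a sequence of pairs (f(y_k | q_k = 0), f(y_k | q_k = 1));
    pi1 is the assumed prior probability that the first state is active.
    """
    a = [[1 - a01, a01], [a10, 1 - a10]]
    prior = [1 - pi1, pi1]
    llrs = []
    for l0, l1 in lik:
        # Measurement update: normalized forward variable for q_k given y_1..y_k.
        w0, w1 = prior[0] * l0, prior[1] * l1
        s = w0 + w1
        alpha = [w0 / s, w1 / s]
        # Time update: one-step prediction of q_{k+1} given y_1..y_k.
        prior = [alpha[0] * a[0][0] + alpha[1] * a[1][0],
                 alpha[0] * a[0][1] + alpha[1] * a[1][1]]
        llrs.append(math.log(prior[1] / prior[0]))
    return llrs
```

Each slot costs a constant number of multiplications, so the recursion runs in time linear in the number of observed slots. With uninformative likelihoods and a stationary prior, the LLR stays at $\log(\pi_1/\pi_0)$, which mirrors the low-SNR behavior discussed in Section VII.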

It is easy to show that both the UR and the IR are nondecreasing functions of the threshold. Hence, the optimum threshold, i.e., the threshold that maximizes the UR under the constraint that the IR is less than or equal to $\rho_{\max}$, is the largest threshold that satisfies the IR constraint with equality. However, computing the optimum threshold from the CDF of $z_k$ conditioned on the future PU state $q_{k+1} = 1$ is problematic since $q_{k+1}$ is not observable. The obvious method of (a) estimating the PU states during a training period with the forward-backward method, (b) estimating the conditional CDF with an empirical CDF, and (c) finding the threshold from the ECDF and $\rho_{\max}$, is numerically shown to be very close to optimum for all considered SNRs when the PU activity level is high, i.e., when the probability of PU transmission is high. In the opposite situation of a low PU activity level, the method is still close to optimum above a certain SNR, but fails for low SNRs in that the IR exceeds $\rho_{\max}$. A method like the above, but based on the (unconditional) ECDF for $z_k$, obviously avoids the need to estimate the PU states. Furthermore, this method is proven in Theorem 4 to never violate $\rho_{\max}$, regardless of SNR and PU activity levels, but under certain conditions on the PU state transition probabilities, which are argued to be satisfied in practice. Numerical results show that the method works reasonably well when the PU activity level is high. However, the UR is very low compared to the optimum scheme when the PU activity level is low and the SNR is high.

In summary, the paper presents practical methods for computing close-to-optimum thresholds in all cases, except when the SNR and the PU activity level are both low. In the latter case, we can still compute a threshold that respects $\rho_{\max}$, but with a significant loss in UR compared to what is achievable with the optimum method. As an example of the former situation with a high PU activity level, our simulation showed a 116% UR gain compared to the baseline method at an SNR of $-3$ dB and a maximum IR level of 10%, when the LLR threshold was computed from estimated PU states.